DOCUMENT RESUME

TM 005 390

AUTHOR        Brigman, S. Leellen; Bashaw, W. L.
TITLE         Multiple Test Equating Using the Rasch Model.
PUB DATE      [Apr 76]
NOTE          34p.; Paper presented at the Annual Meeting of the American Educational Research Association (60th, San Francisco, California, April 19-23, 1976)
EDRS PRICE    MF-$0.83 HC-$2.06 Plus Postage.
DESCRIPTORS   Ability; Achievement Tests; Computer Programs; *Equated Scores; *Item Analysis; *Mathematical Models; Matrices; Measurement Techniques; Probability; Standard Error of Measurement; *Statistical Analysis
IDENTIFIERS   *Rasch Model; *Test Equating

ABSTRACT
Procedures are presented for equating simultaneously several tests which have been calibrated by the Rasch Model. Three multiple test equating designs are described. A Full Matrix Design equates each test to all others. A Chain Design links tests sequentially. A Vector Design equates one test to each of the other tests. For each design, the Rasch model test equating constants were obtained for four reading vocabulary tests. The standard errors of the constants based on each design are also provided and the appropriate use of each design is discussed. (Author/DEP)






MULTIPLE TEST EQUATING USING THE RASCH MODEL



S. Leellen Brigman
Indiana University

W. L. Bashaw
University of Georgia



American Educational Research Association
Annual Meeting
April 1976
San Francisco, California



Abstract 



This paper presents procedures for equating simultaneously several tests which have been calibrated by the Rasch Model. Three multiple test equating designs are described. A Full Matrix Design equates each test to all others. A Chain Design links tests sequentially. A Vector Design equates one test to each of the other tests. For each design, the Rasch model test equating constants were obtained for four reading vocabulary tests. The standard errors of the constants based on each design are also provided and the appropriate use of each design is discussed.

Angoff (1971) has stated that for test scores to be meaningful, the instruments of measurement must meet three requirements. First, an appropriate scale structure must be defined so that the scores may be communicated, i.e., the scaling process. The second requirement is that special norms or interpretive guides must be prepared for the user of the scores, i.e., the process of norming. The third requirement for a test score to be meaningful is that provisions be made for the maintenance and perpetuation of the scale on which the original test scores are reported, i.e., the process of equating or calibration.

A general definition of test equating is a psychometric process which converts the system of units of one test to the system of units of a second test such that the scores derived from the two tests after conversion will be directly equivalent. Two restrictions are implied by this definition: (1) the measures (tests or forms) must measure the same characteristic, and (2) the conversion must be unique, a transformation of the system of units only, except for random errors associated with the unreliability of the data and the errors associated with the method used for determining the transformation. The second restriction indicates that the resulting conversion should be independent of the persons from whom the data were obtained to develop the conversion; and thus, the conversion should be freely applicable to all situations, i.e., sample-free test calibration.

Angoff (1971) has made one of the few definitive efforts to provide those in psychometrics with a discussion of equating versus calibration of tests, a discussion of the equipercentile and linear models for test equating and/or calibration, and sampling designs for the equating of two tests or the calibration of a test to a reference scale. Most practical applications of test equating have involved the equating of two tests or the calibration of one test to a reference scale. It is obvious why this is true when one reviews the complexities of the sampling design and the procedures associated with equipercentile and linear equating of multiple tests within one study, as evidenced in the Educational Testing Service's Anchor Test Study: Final Report (1972).

A third model, the simple logistic model or Rasch model, may also be used to equate or calibrate tests. The Rasch model provides the researcher with a mathematical model that reduces the complexities of the sampling designs and equating or calibration procedures, especially when equating multiple tests in one study.

In 1969 Panchapakesan successfully applied the simple logistic model (Rasch model) to the problem of equating linked test forms and tests administered to matched samples. To equate scores on two tests, Panchapakesan estimated a constant which represented the difference in the origins of the scales of the two tests. This additive constant could be used to equate the scores on one test to the scores on the second test. The use of the Rasch model for equating multiple tests in one study was not attempted until 1973 in the federally funded Rasch Project (Rentz, Bashaw, Cartledge, and Brigman, 1975).

The purpose of the present study was to develop and illustrate the procedure for obtaining Rasch model test equating constants for three multiple test equating designs. The three designs included in this research differed in the number of independent samples and the number of combinations of tests used to determine the Rasch model test equating constants. Each of the designs requires a different manipulation of the data to obtain the set of Rasch model test equating constants.



Rasch's Structural Model for Items of a Test



Georg Rasch is a Danish mathematician who has been instrumental in the development and investigation of mathematical foundations for "objective measurement," especially in the domains of educational and psychological testing. Rasch (1966a) has stated that "specific objectivity" exists when:

    The comparison of any two subjects can be carried out in such a way that no other parameters are involved than those of the two subjects (p. 104) and when any two stimuli can be compared independently of all other parameters than those of the two stimuli. (p. 105)

In Rasch's 1960 book, Probabilistic Models for Some Intelligence and Attainment Tests, he presented a detailed discussion of three "models for measuring". Rasch (1961) stated that:

    Each model specifies a distribution function for the potential responses of a given person to a given stimulus of a certain set of allied stimuli, and this distribution function depends upon a parameter characterizing the person and a parameter characterizing the stimulus. (p. 321)

An important property of these models when analyzing data is the ability to detach the person parameters from the stimulus parameters and vice versa.

In the field of educational and psychological measurement, "A Structural Model for Items of a Test" has become known as the "Rasch model". In the development of the Rasch model three assumptions were made. Rasch (1966b) has listed the assumptions as follows:

    (a) To each situation in which a subject (s = 1, 2, ..., n) has to answer an item (i = 1, 2, ..., m) there is a corresponding probability of a correct answer ($X_{si} = 1$) which we shall write in the form

        $$\Pr(X_{si} = 1) = \frac{\lambda_{si}}{1 + \lambda_{si}}, \qquad (\lambda_{si} > 0).$$

    (b) The situation parameter $\lambda_{si}$ is the product of two factors

        $$\lambda_{si} = \pi_s \, \omega_i$$

        where $\pi_s$ pertains to the subject and $\omega_i$ to the item.

    (c) Given the values of the parameters, answers are stochastically independent. (p. 50)
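
As a brief illustration of assumptions (a) and (b), the sketch below computes the probability of a correct answer from a subject parameter and an item parameter, and the same probability on the log scale used in the rest of the paper. The function names `p_correct` and `p_correct_log` are hypothetical and are not taken from Rasch or from any program cited here.

```python
import math


def p_correct(subject_param: float, item_param: float) -> float:
    """Probability of a correct answer under assumptions (a) and (b).

    The situation parameter is the product of the subject parameter
    and the item (easiness) parameter; both must be positive.
    """
    if subject_param <= 0 or item_param <= 0:
        raise ValueError("parameters must be positive")
    situation = subject_param * item_param
    return situation / (1.0 + situation)


def p_correct_log(log_ability: float, log_easiness: float) -> float:
    """Same probability written as a logistic function of
    log ability + log easiness."""
    return 1.0 / (1.0 + math.exp(-(log_ability + log_easiness)))
```

On the log scale the two parameters combine additively, which is why a difference in scale origins between two tests can later be handled with a single additive constant.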

Rentz, Bashaw, Cartledge and Brigman (1975) have defined three "antecedent conditions" which are necessary for model fit when analyzing data with the Rasch model. These conditions are implications of the assumptions of the model. The first condition is that the item pool to be analyzed must be unidimensional. The second antecedent condition is that of equal item discrimination; the rate of increase in the probability of passing an item as the ability increases must be equal for all items. The third antecedent condition is that guessing must be absent or minimal in the item responses to reduce the probability of passing an item by chance.

In a 1967 presentation to the Invitational Conference on Testing Problems, Benjamin Wright operationalized and demonstrated the Rasch model's claims of objectivity. The two basic outcomes, or consequent conditions, of the Rasch model with which Wright dealt in his presentation were: (1) the calibration of test items independent of the sample of subjects, and (2) the measurement of a person on the latent trait independent of the particular items used.

Wright and Panchapakesan (1969) have described estimation techniques for the Rasch model parameters, $\pi_s$ and $\omega_i$, where $\omega_i$, the item parameters, are invariant over different samples, and $\pi_s$, the person ability parameters, are invariant over samples and item sets. Associated with their work are computer programs which can be used to perform Rasch analyses of test data; one program is commonly referred to as the MESAMAX program, written by Wright and Skirmont (1972) and employed in the present study. Another is called CALFIT, written by Wright and Mead (1975).



Equating Tests with the Rasch Model

To employ the Rasch model for equating tests, two general conditions must be met: (1) the tests to be equated must be parallel, and (2) the tests must provide an acceptable fit with the model. Equated or equivalent scores when using the Rasch model can be defined as scores on two tests which give rise to the same estimate of ability.

When Rasch model analysis of an n-item test is performed, n Rasch item easiness estimates are obtained on a log easiness scale with a mean of zero. The MESAMAX program provides easiness estimates that are positive for the easier items and negative for the harder items. Also, for the n-item test, n - 1 Rasch ability estimates will be obtained for the raw scores on a scale of log ability (ability estimates are not obtained for a raw score of zero or a maximum raw score of n). The ability estimates are positive for the higher scores and negative for the lower scores.
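
To make the relation between raw scores and log ability estimates concrete, the following sketch solves for the log ability at which the expected raw score equals an observed raw score, given fixed item easiness estimates. It assumes the logistic form of the model on the log scale and uses a damped Newton iteration. The paper does not describe the estimation algorithm inside MESAMAX, so this is an illustration only, and `ability_for_raw_score` is a hypothetical name.

```python
import math


def ability_for_raw_score(easiness, raw_score, tol=1e-10, max_iter=100):
    """Log ability b such that sum_i logistic(b + e_i) equals raw_score.

    easiness: log easiness estimates e_i for the n items of one test.
    Raw scores of 0 and n have no finite estimate, as noted in the text.
    """
    n = len(easiness)
    if not 0 < raw_score < n:
        raise ValueError("no finite estimate for raw scores of 0 or n")
    b = math.log(raw_score / (n - raw_score))  # starting value
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(b + e))) for e in easiness]
        expected = sum(probs)
        slope = sum(p * (1.0 - p) for p in probs)
        step = (raw_score - expected) / slope
        step = max(-1.0, min(1.0, step))  # damp large Newton steps
        b += step
        if abs(step) < tol:
            break
    return b
```

With easiness estimates centered at zero, scores above the middle of the range map to positive log abilities and scores below it to negative ones, matching the description above.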

The zero point on the log easiness scale is an arbitrary origin. The origin is fixed in the computer program by setting the mean item easiness to zero. This zero point simultaneously fixes the zero point on the log ability scale. Thus, the zero point is arbitrary in the sense that it is defined by the set of items that are analyzed. Equating can be considered as adjusting these arbitrary origins for sets of tests to a common origin.

There are two methods of obtaining Rasch model test equating constants. The first method, the item difficulty method, uses the Rasch model item parameter estimates as the initial values in the procedures for obtaining the constants in a test equating study. The second method, the ability method, uses the Rasch model ability parameter estimates as the initial values in the procedures for obtaining the constants.

For the item difficulty method, the two sets of item data for a pair of tests are pooled and calibrated as one test of $n_1 + n_2$ items. Thus, the $n_1 + n_2$ items are calibrated on a single scale of log easiness with a mean of zero. Since, as in the following examples, the two tests are administered to the same group of subjects, any difference in the average item easiness estimates of the two tests represents the difference in the scale origins of the two tests. This difference in the scale origins is an additive constant that may be used to equate the two ability scales associated with the separate tests, i.e., it is a Rasch model equating constant.
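
The item difficulty method can be sketched in a few lines. The example assumes the pooled calibration has already been run and that the easiness estimates are listed with the first-administered test's items first; the function name and argument layout are hypothetical. The sign convention (second test's average subtracted from the first test's average) follows the one stated later for the cell equating constants.

```python
def item_difficulty_constant(pooled_easiness, n_first):
    """Equating constant from a pooled calibration of two tests.

    pooled_easiness: log easiness estimates for all n1 + n2 items,
        with the n_first items of the first test listed first.
    Returns mean easiness of the first test minus mean easiness
    of the second test.
    """
    first = pooled_easiness[:n_first]
    second = pooled_easiness[n_first:]
    if not first or not second:
        raise ValueError("both tests must contribute at least one item")
    mean_first = sum(first) / len(first)
    mean_second = sum(second) / len(second)
    return mean_first - mean_second
```

Because the pooled scale has mean zero, the two test means straddle zero and their difference does not depend on where the pooled origin happens to fall.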

For the ability method of Rasch model equating, each test is analyzed independently. Since the two tests, as in the subsequent examples, are administered to the same group of subjects, the averages of the ability estimates of the two tests will be equal if the scale origins of the two tests are the same. To obtain an estimate of the difference in the scale origins, an average Rasch model ability estimate is calculated for each test. The difference in the two averages represents the difference in the origins of the scales of the two tests. As in the item difficulty method, the difference value is an additive constant that may be used to equate the two ability scales associated with the individual tests, i.e., it is a Rasch model equating constant.

The Designs and Procedures

Multiple test equating is defined as the process of simultaneous equating of more than two tests or the process of simultaneous calibration of more than one test to a reference scale. Multiple test equating designs may be seen as extensions of simple test equating designs and are defined by the present authors as the schemata for the administration and data collection that are required to equate more than two tests or calibrate more than one test to a reference scale in a single study.



The present research focused on three designs that may be employed in a multiple test equating study. The purpose and the desired product of the equating study dictate the choice of the design. The designs reflect the test data that must be collected and the steps in the equating procedures to estimate the Rasch model equating constants.

For the purpose of describing the three multiple test equating designs and the associated procedures for obtaining the Rasch model equating constants, a multiple test equating matrix was used. The symbols that are used in the following discussion are defined in Table 1 in the Appendix.

For k tests, a multiple test equating matrix is a k x k matrix. The elements, or cells, of the matrix, $T_{ij}$'s, represent all possible test pair combinations that could be administered to independent groups of subjects. For a cell in the multiple test equating matrix, the row index corresponds to the test that would be administered first to the group of subjects and the column index corresponds to the test that would be administered second. The diagonal cells in the multiple test equating matrix would represent two administrations of one test to a single group of subjects. Data for the diagonal cells may or may not be collected in a study. The cells, or test pair combinations, below the diagonal of the multiple test equating matrix represent the counterbalanced testing orders of the test pair combinations above the diagonal of the matrix. If the researcher elects not to collect data for the diagonal cells of the multiple test equating matrix, the matrix consists of k x k - k cells or test pair combinations as seen in Figure 1.
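
The off-diagonal cells of the multiple test equating matrix can be enumerated directly. The sketch below is only a counting aid: it lists each cell as a pair (first-administered test, second-administered test), numbering tests from 1 with the base test as test 1. The function name is hypothetical.

```python
from itertools import permutations


def off_diagonal_cells(k):
    """All ordered test pairs (i, j) with i != j for k tests.

    Each pair stands for one cell T_ij: test i administered first,
    test j administered second. Diagonal cells are excluded.
    """
    return list(permutations(range(1, k + 1), 2))
```

For the four tests used later in the paper this gives 4 x 4 - 4 = 12 cells, with every pair appearing in both testing orders.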

In the Anchor Test Study and the Rasch Project, one of the tests to be equated was selected as an anchor, or base, test. For convenience of describing the different multiple test equating designs and their associated procedures for obtaining the final Rasch model test equating constants, the base test was always assigned to the first row and the first column of the multiple test equating matrix.

The Full Design for Multiple Test Equating

The Full Design for multiple test equating is defined as an equating design in which all test pair combinations in the multiple test equating matrix are administered. The data obtained on all test pairs are used to estimate the final Rasch model test equating constants. An illustration of the Full Design is the multiple test equating matrix (see Figure 1). The Full Design with test-parallel form combinations on the diagonal was used in the ATS and Rasch Project.

To estimate the final Rasch model test equating constants for the Full Design, the researcher may use either the item difficulty method (item easiness estimates) or the ability method (person ability estimates) to obtain the initial estimates of the difference in the scale origins of a test pair in a cell of the design. MESAMAX analyses are performed for each of the cells in the Full Design. For each test pair, the average of the test that was administered second in the test pair is subtracted from the average of the test that was administered first in the test pair. These differences in the averages are the Rasch model cell equating constants, denoted by $c_{ij}$, where i corresponds to the index of the test that was administered first in the test pair and j corresponds to the index of the test that was administered second in the test pair. The next step in the equating process is to organize the $c_{ij}$'s into their appropriate cells in the multiple test equating matrix with zeros inserted in the empty diagonal cells of the matrix.

To obtain a single Rasch model equating constant for each test in the Full Design, the sets of cell equating constants are first combined to yield marginal equating means. These means are obtained by summing the $c_{ij}$'s in each row in the matrix and dividing by k to obtain the row marginal means, $c_{i\cdot}$'s, and by summing the $c_{ij}$'s in each column in the matrix and dividing by k to obtain the column marginal means, $c_{\cdot j}$'s. The two marginal means for a test correspond to the order of test administration, with the row marginal mean reflecting the effects of the test when it was administered first in a test pair and the corresponding column marginal mean reflecting the effects of the test when it was administered second in a test pair.

The next step in the procedure is to combine the marginal means for a test in the Full Design. This is done by averaging the row marginal mean with its corresponding column marginal mean for each test. To do this, the signs of the column marginal means are reversed. The averages of the row marginal mean and the sign-reversed column marginal mean, $(c_{i\cdot} + c_{\cdot i})/2$, are the preliminary Rasch model equating constants, $c_i$'s, for the Full Design.

The last step in the process for obtaining Rasch model equating constants for the Full Design is to adjust the preliminary equating constants for the k tests to the base test. To do this, the preliminary equating constant of the base test is subtracted from each of the k preliminary equating constants for the tests in the study. This will yield k Rasch model test equating constants, $C_i$'s, that may be used to equate the ability estimates of the base test to the ability estimates that would have been obtained on the scale of ability that is associated with any other test in the study.
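
The Full Design steps described above (marginal means, sign reversal of the column means, averaging, and adjustment to the base test) can be sketched as follows. The matrix layout, with the base test in row and column 0 and zeros on the diagonal, and the function name are assumptions made for this illustration; the values in the usage note are invented and are not the paper's data.

```python
def full_design_constants(c):
    """Final equating constants for the Full Design.

    c: k x k list of lists; c[i][j] is the cell constant for test i
       administered first and test j second, with zeros on the
       diagonal. Index 0 is the base test.
    """
    k = len(c)
    row_means = [sum(c[i][j] for j in range(k)) / k for i in range(k)]
    col_means = [sum(c[i][j] for i in range(k)) / k for j in range(k)]
    # Reverse the sign of each column mean, then average it with
    # the matching row mean to get the preliminary constants.
    prelim = [(row_means[i] - col_means[i]) / 2.0 for i in range(k)]
    # Adjust to the base test by subtracting its preliminary constant.
    return [p - prelim[0] for p in prelim]
```

As a check on the logic, if every cell constant were exactly the difference between two per-test origins, `c[i][j] = e[i] - e[j]`, the function would return `e[i] - e[0]` for each test, so the base test's constant is zero.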

Four standardized reading vocabulary tests appropriate for fifth grade students were selected to illustrate the procedure associated with each design in this study. These four tests were taken from the following reading achievement batteries: (1) California Achievement Tests (1970), Reading, Form A, Level 3 (CAT A3); (2) Iowa Test of Basic Skills (1970), Form 5, Level 11 (ITBS 5, 11); (3) Metropolitan Reading Tests (1970), Form F, Intermediate Level (MAT FI); and (4) SRA Achievement Series (1971), Form E, Blue Edition (SRA EB). The CAT A3 was selected as the base test in all of the present illustrations. Using the data collected on these four tests in the Anchor Test Study, random samples of 500 subjects were drawn from each test pair cell in the multiple test equating matrix. Rasch analyses were performed for each test pair. Using the item difficulty method, an estimate of the mean item easiness was obtained for the set of items in each test.



Table 2 shows the cell equating constants and the intermediate values used to obtain the four final Rasch model test equating constants. This illustrates the procedures for equating multiple tests in a Full Design. These equating constants are values that would be added to the ability estimate corresponding to a raw score that was obtained on the base test to determine the equivalent ability (equated score) on a second test.



Insert Table 2 about here



The Chain Design for Multiple Test Equating

The Chain Design for multiple test equating is defined as an equating design in which adjacent test pair combinations in the multiple test equating matrix are linked. The k tests in the multiple test equating study are numbered from 1 to k with the base test beginning the series. For the Chain Design, the cells in the multiple test equating matrix that are used in the equating study are $T_{12}, T_{23}, T_{34}, \ldots, T_{k-1,k}$ and the cells of the counterbalanced test orders of these test pair combinations located below the diagonal in the multiple test equating matrix. For k tests, the Chain Design requires that data for k + k - 2 test pair combinations be administered and used to estimate the Rasch model test equating constants. Figure 2 illustrates the Chain Design in the multiple test equating matrix for k tests.

Insert Figure 2 about here



To estimate the final Rasch model test equating constants for the Chain Design, the researcher may use either the item difficulty method or the ability method to obtain the initial estimates of the difference in the scale origins of a test pair in a cell in the design. MESAMAX analyses are performed for each of the k + k - 2 cells in the Chain Design. For each test pair, the average of the test that was administered second in the test pair is subtracted from the average of the test that was administered first in the test pair. These differences in the averages are Rasch model cell equating constants, denoted by $c_{ij}$, where i corresponds to the index of the test that was administered first in the test pair and j corresponds to the index of the test that was administered second in the test pair. The next step in the equating process is to organize the $c_{ij}$'s into their appropriate cells in the multiple test equating matrix.

To obtain a single Rasch model equating constant for each test in the Chain Design, the two cell equating constants for a test pair combination, $c_{ij}$ and $c_{ji}$, are combined to obtain a preliminary equating constant for each of the k tests in the Chain Design. To do this, the signs of the test pair cell equating constants for the cells above the diagonal of the multiple test equating matrix are reversed. The averages of the two cell equating constants for a test pair in the Chain Design, $(c_{ij} + c_{ji})/2$, are the preliminary equating constants, $c_{(ij)}$'s, for the Chain Design.

The last step in the process for obtaining Rasch model equating constants for each of the tests in the Chain Design, adjusted to the base test, consists of adding together all of the preliminary equating constants for the tests that link a particular test to the base test. Thus, the final Rasch model test equating constant for test i, $C_i$, is the sum of $c_{(12)}, c_{(23)}, \ldots, c_{(i-1,i)}$. This will yield k Rasch model test equating constants, $C_i$'s, that may be used to equate the ability estimates of the base test to the ability estimates that would have been obtained on the scale of ability that is associated with any other test in the study.

Table 3 presents the cell equating constants for the four tests in a Chain Design. Each of the constants may be added to the ability estimates of the base test to determine the equivalent abilities on the equated tests.
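
The Chain Design procedure (reverse the sign of each above-diagonal cell constant, average it with its counterbalanced partner, then accumulate the links outward from the base test) can be sketched as follows. The argument layout and the function name are assumptions made for this illustration, and the base test is given a constant of zero.

```python
def chain_design_constants(above, below):
    """Final equating constants for the Chain Design.

    above[m]: cell constant for the adjacent pair with the
        lower-numbered test administered first (above the diagonal).
    below[m]: cell constant for the same pair in the counterbalanced
        order (below the diagonal).
    Returns one constant per test, starting with the base test.
    """
    if len(above) != len(below):
        raise ValueError("each link needs both testing orders")
    # Preliminary constant per link: sign-reversed above-diagonal
    # value averaged with the below-diagonal value.
    links = [(-a + b) / 2.0 for a, b in zip(above, below)]
    finals = [0.0]
    for link in links:
        finals.append(finals[-1] + link)  # running sum along the chain
    return finals
```

Since each final constant is a running sum, any error in an early link carries forward to every test further along the chain.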


Insert Table 3 about here 



The Vector Design for Multiple Test Equating 

The Vector Design for multiple test equating is defined as an equating design in which all tests in the equating study are administered in combination with the base test. In the multiple test equating matrix, the base test appears in the test pair combinations of the first row and the first column of the matrix. For the Vector Design for k tests, the cells in the multiple test equating matrix that are used in the equating study are $T_{12}, T_{13}, T_{14}, \ldots, T_{1k}$ and the cells of the counterbalanced test orders of these test pair combinations. For k tests, the Vector Design requires that data for k + k - 2 test pair combinations be used to estimate the Rasch model test equating constants. Figure 3 illustrates the Vector Design in the multiple test equating matrix for k tests.



Insert Figure 3 about here


To estimate the final Rasch model test equating constants for the Vector Design, the researcher may use either the item difficulty method or the ability method to obtain the initial estimates of the difference in the scale origins of a test pair in a cell in the design. MESAMAX analyses are performed for each of the k + k - 2 cells in the Vector Design. For each test pair, the average for the test that was administered second is subtracted from the average of the test that was administered first in the test pair. These differences in the averages are the Rasch model cell equating constants, denoted by $c_{ij}$, where i corresponds to the index of the test that was administered first in the test pair and j corresponds to the index of the test that was administered second in the test pair. The next step in the equating process is to organize the $c_{ij}$'s into their appropriate cells in the multiple test equating matrix.

To obtain a single Rasch model equating constant for a test in the Vector Design, say test i, the two cell equating constants for the test pair combination, $c_{1i}$ and $c_{i1}$, are combined to obtain the final Rasch model test equating constant for test i. This is done for each test in the study. To combine the $c_{ij}$'s for a test pair, the signs of the test pair cell equating constants for the cells in the row vector of the design are reversed. The averages of the two cell equating constants for a test pair in the Vector Design, $(c_{1i} + c_{i1})/2$, are the final Rasch model test equating constants, $C_i$'s, that may be used to equate the ability estimates of the base test to the ability estimates that would have been obtained on the scale of ability associated with any other test in the study.
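
For the Vector Design each test is linked to the base test directly, so no accumulation is needed. A minimal sketch under the same conventions as the text (row-vector cells have the base test administered first, and their signs are reversed before averaging) follows; the function name and argument layout are hypothetical.

```python
def vector_design_constants(row, col):
    """Final equating constants for the Vector Design.

    row[m]: cell constant with the base test first and test m + 2
        second (first row of the matrix).
    col[m]: cell constant with test m + 2 first and the base test
        second (first column of the matrix).
    Returns one constant per test, starting with the base test.
    """
    if len(row) != len(col):
        raise ValueError("each test needs both testing orders")
    return [0.0] + [(-r + c) / 2.0 for r, c in zip(row, col)]
```

Each constant here depends on only two cells, which is the trade-off against the Full Design, where every constant draws on a full row and column of the matrix.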

Table 4 presents the data on the four reading vocabulary tests in a Vector Design. As in the previous two multiple test equating designs, the final equating constant for a particular test is the value to be added to the Rasch ability estimate based on the ability scale of the base test (CAT A3) to obtain the equivalent estimate on the ability scale of the particular test.



Insert Table 4 about here

According to Angoff (1971), Donlon and Angoff (1971) and Rentz, et al. (1975), the major source of error in test equating is the unreliability of the test data, i.e., the standard error of measurement. An analysis of the estimated error in equating in the ATS found that the errors of equating would be a trivial factor relative to the error of measurement.

Rentz, et al. (1975), have identified three sources of error that are associated with test equating with the Rasch model: (1) the error of measurement, (2) the error of the equating constants, and (3) the "assignment" error. For Rasch calibrated tests, the standard error of measurement appears to be approximately 0.2 or more log ability units or 2.5 to 3.5 raw score units for typical length tests.

The second source of error in Rasch model test equating is associated with the equating constants. Depending on the equating design, the data manipulation procedures, and the number of tests, the estimates of the various errors of the equating constants are based on the standard errors of item easiness estimates and the formula for the addition of uncorrelated variances.

The MESAMAX computer program provides the standard errors associated with each item easiness estimate. The first step was to obtain the average variance error of the items in a test. This yielded an average error for each test in a test pair combination. The next step was to add the two variance errors for a test pair to obtain a variance error associated with the cell equating constant. To obtain the variance errors associated with the final equating constants, the variance error for each cell was combined in the same order as were the cell equating constants. Finally, the standard error for each final equating constant was obtained by taking the square root of the variance error associated with the final equating constants. The standard errors associated with the final Rasch model test equating constants for each design are presented in Table 5.
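
The error calculation just described can be sketched for the simplest case, a final constant formed by averaging two cell constants as in the Vector Design. The first two functions follow the steps in the text. The last function rests on an assumption not spelled out in the paper: that "combined in the same order" means ordinary propagation for uncorrelated quantities, so the variance of an average of two cells is one quarter of the sum of their variances. All function names are hypothetical.

```python
import math


def mean_item_variance(item_standard_errors):
    """Average variance error of the items in one test."""
    squared = [se * se for se in item_standard_errors]
    return sum(squared) / len(squared)


def cell_variance(se_items_first, se_items_second):
    """Variance error of a cell constant: the sum of the two tests'
    average item variance errors."""
    return mean_item_variance(se_items_first) + mean_item_variance(se_items_second)


def averaged_constant_se(var_cell_a, var_cell_b):
    """Standard error of the average of two cell constants.

    Assumption: the cells are uncorrelated, so
    Var((x + y) / 2) = (Var x + Var y) / 4.
    """
    return math.sqrt((var_cell_a + var_cell_b) / 4.0)
```

The Full and Chain Designs would need the corresponding rules for means over rows and columns and for sums of links; the Rasch Project final report cited below is the source for the estimation procedures actually used.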



Insert Table 5 about here 



The errors associated with the equating constants were minor in comparison to other types of errors. For the Rasch Project, the variance error of equating constants was on the order of 0.02 log ability units, or approximately ten to fifteen percent of the standard error of measurement. This agreed with the estimates of the standard errors of equating provided by Donlon and Angoff (1971) for the SAT.

The third source of error, the assignment error,
is associated with the assignment of equivalent raw
scores on two tests. If a common log ability scale were
used in reporting equivalent scores, there would be no
assignment error.

Thus, the major source of error in test equating
is still the error of measurement of the raw data. The
second largest source of error in the Rasch Project was
the assignment error, which could be eliminated by
calibrating tests on a single log ability reference
scale as opposed to raw-score-to-raw-score equating. Of
the three sources of error in the Rasch model equating,
the errors associated with the equating constants are
minimal. Estimation procedures for the variance errors
of the equating constants are presented in the Rasch
Project final report.
Discussion and Summary

The purpose of this study was to develop and illustrate
the procedures for obtaining Rasch model test
equating constants for three multiple test equating
designs. A multiple test equating matrix was defined
and three multiple test equating designs, Full, Chain,
and Vector, were described in terms of the matrix. The
procedures for obtaining Rasch model test equating
constants were delineated in a general form for each
design. The Full Design closely parallels the design
that was employed in the ATS and Rasch Project. This
design included all possible combinations of two tests
and the counterbalanced testing order combinations.
Obviously this design would be the preferred design for
equating a set of k tests since it contains the maximum
information in terms of test pair combinations. But
the Full Design requires large numbers of subjects and
research funds as the number of tests included in the
study increases. Also, all of the tests included in
the Full Design must be appropriate for a common level
or age group.

The Chain Design and Vector Design provide alternatives
for the researcher who is limited in his resources.
The Chain Design is particularly useful in a setting
where the tests in the equating study are sequential in
the levels or age groups for which they are appropriate,
e.g., the Sequential Tests of Educational Progress (STEP).
The Vector Design is useful in a setting similar to the
Full Design where limited resources are available and
the tests in the equating study are appropriate for a
common level or age group.

The investigation of the use of the Rasch model in
the area of test equating began with Panchapakesan in
1969 and reached a major point with the Rasch Project in
1975. The simplicity of the Rasch model is characterized
by the fact that only a single value, a test equating
constant, is required to adjust the scale of one
test to the scale of a second test. The present study
has provided the general procedures and examples for
obtaining the Rasch model test equating constants for
three multiple test equating designs. These procedures
are adaptable to any number of tests that might be
included in a multiple test equating study.

To illustrate the procedures for obtaining the
Rasch model test equating constants, multiple test
equatings of four tests were performed. Random samples of
500 cases of test pair data for each cell in the multiple
test equating matrix were drawn from the data collected
for the ATS and Rasch Project. A set of four Rasch model
test equating constants was obtained for each of the
designs in the present study. Since each design has
its own utility, no statistical comparisons of the sets
of constants across designs were appropriate.

Angoff (1971), Donlon and Angoff (1971), and Rentz,
et al. (1975) have pointed out that the major source of
error in test equating is the usual standard error of
measurement due to the unreliability of the individual
tests. However, the error associated with the equating
constants must be examined. Adapting the procedures
developed by Rentz, et al. (1975) to the three designs
in the present study, standard errors of the equating
constants were obtained for each design with 500 cases
of data in each cell. Table 5 presents the standard
errors of the equating constants. Using the procedure
for obtaining crude estimates provided by Rentz, et al.
(1975), the crude estimate of the standard error of the
equating constants for the Full Design with four tests
was .0106. The standard errors reported for the Full
Design are all within .0006 of the crude estimate. For
the Chain Design the crude estimates for the standard
errors were .0173 for the first linked test equating
constant (ITBS 5,11), .0245 for the second linked test
equating constant (MAT FI), and .0300 for the third
linked test equating constant (SRA EB). All of the crude
estimates of the standard errors for the constants in
the Chain Design were within .0008 of the reported values
for the standard errors of the equating constants for
the Chain Design in Table 5. For the Vector Design the
crude estimate of the standard error of the equating
constants was .0173, and all of the reported values are
within .0011 of the crude estimate. For the Chain and
Vector Designs, the standard error of the equating
constant for the base test is zero since no estimates
are directly obtained for the equating constant for the
base test.
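
The comparisons in this paragraph can be checked with a short Python sketch. The crude estimates and reported standard errors are taken from the text and Table 5. The observation that the Chain Design crude estimates equal .0173 multiplied by the square root of the link number is an inference from the quoted values, not a statement of the Rentz, et al. (1975) procedure, and the helper name `largest_gap` is invented for this illustration.

```python
import math

# Reported standard errors from Table 5 (base test CAT A3 omitted for Chain and Vector).
reported = {
    "Full": [0.0111, 0.0105, 0.0108, 0.0109],
    "Chain": [0.0179, 0.0240, 0.0292],
    "Vector": [0.0179, 0.0184, 0.0182],
}

# Crude estimates quoted in the text. For the Chain Design they are rebuilt here
# as .0173 * sqrt(link number), an assumption inferred from the quoted values.
crude = {
    "Full": [0.0106] * 4,
    "Chain": [0.0173 * math.sqrt(link) for link in (1, 2, 3)],
    "Vector": [0.0173] * 3,
}


def largest_gap(design):
    """Largest absolute difference between crude and reported standard errors."""
    return max(abs(c - r) for c, r in zip(crude[design], reported[design]))


for design in reported:
    print(design, round(largest_gap(design), 4))
```

The largest gaps stay within the tolerances stated above: .0006 for the Full Design, .0008 for the Chain Design, and .0011 for the Vector Design.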

From an examination of the standard errors of the
equating constants reported in Table 5, it is obvious
that the standard errors are smaller for the constants
based on the Full Design. The standard errors of the
equating constants for the Vector Design are only slightly
larger than those for the Full Design. However, the
standard errors of the equating constants based on the
Chain Design increase as the number of links in the chain
between the base test and the test to be equated increases.
Employing the √n defined by Donlon and Angoff (1971) as
the increase in the equating error and using the standard
error of the equating constant for ITBS 5,11 of .0179,
the expected standard error of the second link would be
.0245 and for the third link would be .0310. The standard
error value obtained for MAT FI was .0240 and for SRA EB
was .0290. Considering the concern generated by the
continuing drop in the national norms on the SAT in
recent years and the number of links in the equating of
each new form to the original 1941 normative or reference
form of the test, a strong possible explanation for the
drop may be tied to the equating error in each new form
of the test.

The present research has provided those interested
in test equating the procedures for obtaining Rasch
model test equating constants for three multiple test
equating designs. The next step in the application of
the Rasch model to the test equating domain is in the
area of test calibration of multiple tests to a common
reference scale. Calibration would seem a preferable
process to equating if for no other reason than the
potential of reducing the errors associated with
raw-score-to-raw-score equating, i.e., the assignment error,
and the errors associated with the equating constants.
Symbols and Definitions

k             the number of tests to be equated

[illegible]   the number of items on test i

T_ij          test pair (i,j) with test i
              administered first

c_ij          Rasch model cell equating constant
              for equating test i to test j, with
              test i administered first

c_i.          ith row marginal mean of the c_ij's

c_.j          jth column marginal mean of the c_ij's

c_i           Rasch model preliminary equating
              constant for test i

C_i           Final Rasch model equating constant
              for test i
[Figure 1: matrix with tests 1 through k as rows and columns, whose off-diagonal cells hold the test pairs, e.g., T12, T13, T21, T31, ..., Tk2, Tk3.]

Figure 1. A Multiple Test Equating Matrix for K Tests
Table 2

Cell Equating Constants for the Full Design

Tests                  CAT A3   ITBS 5,11   MAT FI   SRA EB   Row mean

CAT A3                   0        .673       .420    -.150     .2358
ITBS 5,11              -.585       0        -.113    -.496    -.2985
MAT FI                 -.458      .225        0      -.363    -.1490
SRA EB                  .007      .559       .422      0       .2470

Column mean            -.2590     .3643      .1823   -.2523
Preliminary constant    .2473    -.3315     -.1655    .2496
Final constant           0       -.579      -.413     .002
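
The arithmetic behind the last three rows of Table 2 can be sketched in Python. The formulas used here are inferred from the tabled values and are an assumption of this illustration: the preliminary constant is taken as half the difference between a test's row mean and its column mean, and the final constant as the preliminary constant minus that of the base test (CAT A3). The cell values are as best read from Table 2; the entries -.496 and .422 are recovered from the marginal means. The function name is hypothetical.

```python
# Cell equating constants as best read from Table 2 (row = test administered first).
tests = ["CAT A3", "ITBS 5,11", "MAT FI", "SRA EB"]
cells = [
    [0.0, 0.673, 0.420, -0.150],
    [-0.585, 0.0, -0.113, -0.496],
    [-0.458, 0.225, 0.0, -0.363],
    [0.007, 0.559, 0.422, 0.0],
]


def full_design_constants(cells, base=0):
    """Return row means, column means, preliminary and final constants."""
    k = len(cells)
    # Means are taken over all k cells, including the zero diagonal.
    row_means = [sum(row) / k for row in cells]
    col_means = [sum(cells[i][j] for i in range(k)) / k for j in range(k)]
    # Assumed rule: half the difference between row mean and column mean.
    preliminary = [(r - c) / 2 for r, c in zip(row_means, col_means)]
    # Shift so that the base test has a constant of zero.
    final = [p - preliminary[base] for p in preliminary]
    return row_means, col_means, preliminary, final


row_means, col_means, preliminary, final = full_design_constants(cells)
for name, constant in zip(tests, final):
    print(f"{name:10s} {constant: .3f}")
```

Rounded to three places, the final constants agree with the last row of Table 2.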
[Figure 2: matrix with tests 1 through k as rows and columns in which only adjacent test pairs are filled: T12, T21, T23, T32, T34, T43, ..., T(k-1,k), T(k,k-1).]

Figure 2. A Chain Design for Multiple Test Equating for K Tests






Table 3

Cell Equating Constants in the Chain Design

Tests               CAT A3   ITBS 5,11   MAT FI   SRA EB

CAT A3                         .673
ITBS 5,11           -.585                -.113
MAT FI                         .226               -.363
SRA EB                                    .422

c_i (preliminary)     0       -.629      .1695    .3925
C_i (final)           0       -.629     -.460    -.067
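
The chain computation behind the two bottom rows of Table 3 can be sketched in Python. The linking rule is inferred from the tabled values and is an assumption of this illustration: each link constant is half the difference between the two cell constants for an adjacent pair, and the final constants are running sums of the link constants starting from the base test. The function and variable names are hypothetical.

```python
# Adjacent-pair cell constants from Table 3, ordered along the chain.
# Each tuple is (earlier test first, later test first).
links = [
    (0.673, -0.585),   # CAT A3 and ITBS 5,11
    (-0.113, 0.226),   # ITBS 5,11 and MAT FI
    (-0.363, 0.422),   # MAT FI and SRA EB
]


def chain_constants(links):
    """Return preliminary (per-link) and final (cumulative) constants."""
    preliminary = [0.0]  # the base test has no link of its own
    final = [0.0]
    for earlier_first, later_first in links:
        # Assumed rule: half the difference of the two counterbalanced cells.
        step = (later_first - earlier_first) / 2
        preliminary.append(step)
        # Each test is linked to the base test through all earlier links.
        final.append(final[-1] + step)
    return preliminary, final


preliminary, final = chain_constants(links)
print([round(value, 4) for value in preliminary])
print([round(value, 4) for value in final])
```

Because each final constant accumulates every earlier link, any error in an early link carries forward, which is consistent with the growth of the Chain Design standard errors in Table 5.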
Figure 3. A Vector Design for Multiple Test Equating for K Tests
Table 4

Cell Equating Constants for the Vector Design

Tests        CAT A3   ITBS 5,11   MAT FI   SRA EB

CAT A3                  .673       .420    -.150
ITBS 5,11    -.584
MAT FI       -.458
SRA EB        .007

C_i            0       -.629      -.439    -.079
Table 5

Standard Errors of the Equating Constants for
Each Multiple Test Equating Design

             Standard Error of the Equating Constant
Design       CAT A3   ITBS 5,11   MAT FI   SRA EB

Full         .0111     .0105      .0108    .0109
Chain        .0000*    .0179      .0240    .0292
Vector       .0000*    .0179      .0184    .0182

*Since no direct operations were performed to estimate the
Rasch model test equating constant for the base test in
the Chain and Vector Designs, the standard error of the
equating constant for the base test is .0000.